nlp_architect.models package

Submodules

nlp_architect.models.bist_parser module

class nlp_architect.models.bist_parser.BISTModel(activation='tanh', lstm_layers=2, lstm_dims=125, pos_dims=25)[source]

Bases: object

BIST parser model class. This class handles training, prediction, loading and saving of a BIST parser model. After the model is initialized, it accepts a CoNLL formatted dataset as input, and learns to output dependencies for new input.

Parameters:
  • activation (str, optional) – Activation function to use.
  • lstm_layers (int, optional) – Number of LSTM layers to use.
  • lstm_dims (int, optional) – Number of LSTM dimensions to use.
  • pos_dims (int, optional) – Number of part-of-speech embedding dimensions to use.
model

The underlying LSTM model.

Type:MSTParserLSTM
params

Additional parameters and resources for the model.

Type:tuple
options

User model options.

Type:dict
fit(dataset, epochs=10, dev=None)[source]

Trains a BIST model on an annotated dataset in CoNLL file format.

Parameters:
  • dataset (str) – Path to input dataset for training, formatted in CoNLL/U format.
  • epochs (int, optional) – Number of learning iterations.
  • dev (str, optional) – Path to development dataset for conducting evaluations.
load(path)[source]

Loads and initializes a BIST model from file.

predict(dataset, evaluate=False)[source]

Runs inference with the BIST model on a dataset in CoNLL file format.

Parameters:
  • dataset (str) – Path to input CoNLL file.
  • evaluate (bool, optional) – Write prediction and evaluation files to dataset’s folder.
Returns:

The list of input sentences with predicted dependencies attached.

Return type:

res (list of list of ConllEntry)

predict_conll(dataset)[source]

Runs inference with the BIST model on a dataset in CoNLL object format.

Parameters:dataset (list of list of ConllEntry) – Input in the form of ConllEntry objects.
Returns:The list of input sentences with predicted dependencies attached.
Return type:res (list of list of ConllEntry)
save(path)[source]

Saves the BIST model to file.
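
Usage sketch, based on the signatures above; the CoNLL-U file paths are placeholders:

```python
from nlp_architect.models.bist_parser import BISTModel

# 'train.conllu', 'dev.conllu' and 'test.conllu' are placeholder paths
parser = BISTModel(activation='tanh', lstm_layers=2, lstm_dims=125, pos_dims=25)
parser.fit('train.conllu', epochs=10, dev='dev.conllu')
predictions = parser.predict('test.conllu', evaluate=True)  # list of lists of ConllEntry
parser.save('bist.model')
```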

nlp_architect.models.chunker module

class nlp_architect.models.chunker.SequenceChunker(use_cudnn=False)[source]

Bases: nlp_architect.models.chunker.SequenceTagger

A sequence chunker model written in TensorFlow (Keras), based on the SequenceTagger model. It uses only the chunking output of the underlying multi-task model.

predict(x, batch_size=1)[source]

Predict labels given x.

Parameters:
  • x – samples for inference
  • batch_size (int, optional) – forward pass batch size
Returns:

tuple of numpy arrays of chunk labels

class nlp_architect.models.chunker.SequencePOSTagger(use_cudnn=False)[source]

Bases: nlp_architect.models.chunker.SequenceTagger

A sequence POS tagger model written in TensorFlow (Keras), based on the SequenceTagger model. It uses only the POS output of the underlying multi-task model.

predict(x, batch_size=1)[source]

Predict labels given x.

Parameters:
  • x – samples for inference
  • batch_size (int, optional) – forward pass batch size
Returns:

tuple of numpy arrays of POS labels

class nlp_architect.models.chunker.SequenceTagger(use_cudnn=False)[source]

Bases: object

A sequence tagging model for POS and chunks, written in TensorFlow (Keras) and based on the paper ‘Deep multi-task learning with low level tasks supervised at lower layers’. The model has 3 Bi-LSTM layers and outputs POS and chunk tags.

Parameters:use_cudnn (bool, optional) – use a GPU-based model (CuDNN LSTM cells)
build(vocabulary_size, num_pos_labels, num_chunk_labels, char_vocab_size=None, max_word_len=25, feature_size=100, dropout=0.5, classifier='softmax', optimizer=None)[source]

Build a chunker/POS model

Parameters:
  • vocabulary_size (int) – the size of the input vocabulary
  • num_pos_labels (int) – the number of POS labels
  • num_chunk_labels (int) – the number of chunk labels
  • char_vocab_size (int, optional) – character vocabulary size
  • max_word_len (int, optional) – max characters in a word
  • feature_size (int, optional) – feature size - determines the embedding/LSTM layer hidden state size
  • dropout (float, optional) – dropout rate
  • classifier (str, optional) – classifier layer, ‘softmax’ for softmax or ‘crf’ for conditional random fields classifier. default is ‘softmax’.
  • optimizer (tensorflow.python.training.optimizer.Optimizer, optional) – optimizer, if None will use default SGD (paper setup)
fit(x, y, batch_size=1, epochs=1, validation_data=None, callbacks=None)[source]

Fit the built model on the provided input samples (x) and labels (y).

Parameters:
  • x – x samples
  • y – y samples
  • batch_size (int, optional) – batch size
  • epochs (int, optional) – number of epochs to run before ending training process
  • validation_data (optional) – x and y samples to validate at the end of the epoch
  • callbacks (optional) – additional callbacks to run with fitting
load(filepath)[source]

Load model from disk

Parameters:filepath (str) – file name of model
load_embedding_weights(weights)[source]

Load word embedding weights into the model embedding layer

Parameters:weights (numpy.ndarray) – 2D matrix of word weights
predict(x, batch_size=1)[source]

Predict labels given x.

Parameters:
  • x – samples for inference
  • batch_size (int, optional) – forward pass batch size
Returns:

tuple of numpy arrays of pos and chunk labels

save(filepath)[source]

Save the model to disk

Parameters:filepath (str) – file name to save model
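
A minimal build sketch using the documented arguments; the vocabulary and label sizes are placeholders, and preparing the encoded inputs is dataset specific (only indicated in comments):

```python
from nlp_architect.models.chunker import SequenceTagger

tagger = SequenceTagger(use_cudnn=False)
tagger.build(vocabulary_size=10000, num_pos_labels=45, num_chunk_labels=23,
             char_vocab_size=60, max_word_len=25, feature_size=100,
             dropout=0.5, classifier='softmax')
# x_train / y_train must be pre-encoded as the model expects (not shown here):
# tagger.fit(x_train, y_train, batch_size=32, epochs=5)
# pos_tags, chunk_tags = tagger.predict(x_test, batch_size=32)
tagger.save('chunker_model.h5')  # placeholder path
```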

nlp_architect.models.cross_doc_sieves module

nlp_architect.models.cross_doc_sieves.run_entity_coref(topics: nlp_architect.common.cdc.topics.Topics, resources: nlp_architect.models.cross_doc_coref.system.sieves_container_init.SievesContainerInitialization) → List[nlp_architect.common.cdc.cluster.Clusters][source]

Run cross-document coreference on entity mentions.

Parameters:
  • topics (Topics) – the topics (with mentions) to evaluate
  • resources (SievesContainerInitialization) – resources for running the evaluation

Returns:List of topics and mentions with predicted cross doc coref within each topic
Return type:Clusters
nlp_architect.models.cross_doc_sieves.run_event_coref(topics: nlp_architect.common.cdc.topics.Topics, resources: nlp_architect.models.cross_doc_coref.system.sieves_container_init.SievesContainerInitialization) → List[nlp_architect.common.cdc.cluster.Clusters][source]

Run cross-document coreference on event mentions.

Parameters:
  • topics (Topics) – the topics (with mentions) to evaluate
  • resources (SievesContainerInitialization) – resources for running the evaluation

Returns:List of clusters and mentions with predicted cross doc coref within each topic
Return type:Clusters

nlp_architect.models.crossling_emb module

class nlp_architect.models.crossling_emb.Discriminator(input_data, Y, lr_ph)[source]

Bases: object

build_train_graph(disc_pred)[source]

Builds the training graph for the discriminator.

Parameters:disc_pred (object) – Discriminator instance

class nlp_architect.models.crossling_emb.Generator(src_ten, tgt_ten, emb_dim, batch_size, smooth_val, lr_ph, beta, vocab_size)[source]

Bases: object

build_train_graph(disc_pred)[source]

Builds the training graph for the generator.

Parameters:disc_pred (object) – Discriminator instance

class nlp_architect.models.crossling_emb.WordTranslator(hparams, src_vec, tgt_vec, vocab_size)[source]

Bases: object

Main network which does cross-lingual embeddings training

apply_procrustes(sess, final_pairs)[source]

Applies the Procrustes solution to the W matrix for a better mapping.

Parameters:
  • sess (tf.Session) – TensorFlow session
  • final_pairs (ndarray) – array of pairs which are mutual neighbors

generate_xling_embed(sess, src_dict, tgt_dict, tgt_vec)[source]

Generates cross-lingual embeddings.

Parameters:sess (tf.Session) – TensorFlow session

static report_metrics(iters, n_words_proc, disc_cost_acc, tic)[source]

Reports metrics of how training is going

run(sess, local_lr)[source]

Runs the whole GAN.

Parameters:
  • sess (tf.Session) – TensorFlow session
  • local_lr (float) – learning rate

run_discriminator(sess, local_lr)[source]

Runs the discriminator part of the GAN.

Parameters:
  • sess (tf.Session) – TensorFlow session
  • local_lr (float) – learning rate

run_generator(sess, local_lr)[source]

Runs the generator part of the GAN.

Parameters:
  • sess (tf.Session) – TensorFlow session
  • local_lr (float) – learning rate

Returns:Number of words processed
save_model(save_model, sess)[source]

Saves W in the mapper as a numpy array, based on the CSLS criterion.

Parameters:
  • save_model (bool) – save the model if True
  • sess (tf.Session) – TensorFlow session

static set_lr(local_lr, drop_lr)[source]

Drops the learning rate based on the CSLS criterion.

Parameters:
  • local_lr (float) – learning rate
  • drop_lr (bool) – drop the learning rate by a factor of 2 if True

nlp_architect.models.gnmt_model module

GNMT attention sequence-to-sequence model with dynamic RNN support.

class nlp_architect.models.gnmt_model.GNMTModel(hparams, mode, iterator, source_vocab_table, target_vocab_table, reverse_target_vocab_table=None, scope=None, extra_args=None)[source]

Bases: nlp_architect.models.gnmt.attention_model.AttentionModel

Sequence-to-sequence dynamic model with GNMT attention architecture with sparsity policy support.

nlp_architect.models.intent_extraction module

class nlp_architect.models.intent_extraction.IntentExtractionModel[source]

Bases: object

Intent Extraction model base class (using tf.keras)

fit(x, y, epochs=1, batch_size=1, callbacks=None, validation=None)[source]

Train a model given input samples and target labels.

Parameters:
  • x – input samples
  • y – input sample labels
  • epochs (int, optional) – number of epochs to train
  • batch_size (int, optional) – batch size
  • callbacks (Callback, optional) – Keras compatible callbacks
  • validation (list of numpy.ndarray, optional) – optional validation data to be evaluated when training
input_shape

Get input shape

Type:tuple
load(path)[source]

Load a trained model

Parameters:path (str) – path to model file
load_embedding_weights(weights)[source]

Load word embedding weights into the model embedding layer

Parameters:weights (numpy.ndarray) – 2D matrix of word weights
predict(x, batch_size=1)[source]

Get the prediction of the model on given input

Parameters:
  • x – samples to run through the model
  • batch_size (int, optional) – batch size
Returns:

predicted values by the model

Return type:

numpy.ndarray

save(path, exclude=None)[source]

Save model to path

Parameters:
  • path (str) – path to save model
  • exclude (list, optional) – a list of object fields to exclude when saving
class nlp_architect.models.intent_extraction.MultiTaskIntentModel(use_cudnn=False)[source]

Bases: nlp_architect.models.intent_extraction.IntentExtractionModel

Multi-Task Intent and Slot tagging model (using tf.keras)

Parameters:use_cudnn (bool, optional) – use a GPU-based model (CuDNN LSTM cells)
build(word_length, num_labels, num_intent_labels, word_vocab_size, char_vocab_size, word_emb_dims=100, char_emb_dims=30, char_lstm_dims=30, tagger_lstm_dims=100, dropout=0.2)[source]

Build a model

Parameters:
  • word_length (int) – max word length (in characters)
  • num_labels (int) – number of slot labels
  • num_intent_labels (int) – number of intent classes
  • word_vocab_size (int) – word vocabulary size
  • char_vocab_size (int) – character vocabulary size
  • word_emb_dims (int, optional) – word embedding dimensions
  • char_emb_dims (int, optional) – character embedding dimensions
  • char_lstm_dims (int, optional) – character feature LSTM hidden size
  • tagger_lstm_dims (int, optional) – tagger LSTM hidden size
  • dropout (float, optional) – dropout rate
save(path)[source]

Save model to path

Parameters:path (str) – path to save model
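
A build sketch with the documented defaults; the vocabulary sizes and label counts are placeholders, and training data preparation is omitted:

```python
from nlp_architect.models.intent_extraction import MultiTaskIntentModel

model = MultiTaskIntentModel(use_cudnn=False)
model.build(word_length=12, num_labels=20, num_intent_labels=7,
            word_vocab_size=10000, char_vocab_size=60,
            word_emb_dims=100, char_emb_dims=30,
            char_lstm_dims=30, tagger_lstm_dims=100, dropout=0.2)
# x_train / y_train (word ids, char ids, intent and slot labels) are dataset
# specific and not shown:
# model.fit(x_train, y_train, epochs=5, batch_size=32)
```
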
class nlp_architect.models.intent_extraction.Seq2SeqIntentModel[source]

Bases: nlp_architect.models.intent_extraction.IntentExtractionModel

Encoder Decoder Deep LSTM Tagger Model (using tf.keras)

build(vocab_size, tag_labels, token_emb_size=100, encoder_depth=1, decoder_depth=1, lstm_hidden_size=100, encoder_dropout=0.5, decoder_dropout=0.5)[source]

Build the model

Parameters:
  • vocab_size (int) – vocabulary size
  • tag_labels (int) – number of tag labels
  • token_emb_size (int, optional) – token embedding vector size
  • encoder_depth (int, optional) – number of encoder LSTM layers
  • decoder_depth (int, optional) – number of decoder LSTM layers
  • lstm_hidden_size (int, optional) – LSTM layers hidden size
  • encoder_dropout (float, optional) – encoder dropout
  • decoder_dropout (float, optional) – decoder dropout
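
A build sketch for the encoder-decoder tagger, again with placeholder sizes:

```python
from nlp_architect.models.intent_extraction import Seq2SeqIntentModel

model = Seq2SeqIntentModel()
model.build(vocab_size=10000, tag_labels=20, token_emb_size=100,
            encoder_depth=1, decoder_depth=1, lstm_hidden_size=100,
            encoder_dropout=0.5, decoder_dropout=0.5)
```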

nlp_architect.models.matchlstm_ansptr module

class nlp_architect.models.matchlstm_ansptr.MatchLSTMAnswerPointer(params_dict, embeddings)[source]

Bases: object

Defines the end-to-end MatchLSTM and Answer Pointer network for reading comprehension.

answer_pointer_pass()[source]

Function to run the answer pointer pass.

Parameters:None
Returns:List of logits for start and end indices of the answer
cal_f1_score(ground_truths, predictions)[source]

Function to calculate F-1 and EM scores

Parameters:
  • ground_truths – labels given in the dataset
  • predictions – logits predicted by the network
Returns:

F1 score and Exact-Match score

create_model()[source]

Function to set up the end-to-end reading comprehension model.

create_variables()[source]

Function to create variables used for training

get_dynamic_feed_params(question_str, vocab_reverse)[source]

Function to get the required feed_dict format for user-entered questions. Used mainly in the demo mode.

Parameters:
  • question_str – question string
  • vocab_reverse – vocab dictionary with words as keys and indices as values
Returns:
  • question_idx – list of indices representing the question, padded to max length
  • question_len – actual length of the question
  • ques_mask – mask for question_idx

inference_mode(session, valid, vocab_tuple, num_examples, dropout=1.0, dynamic_question_mode=False, dynamic_usr_question='', dynamic_question_index=0)[source]

Function to run inference_mode for reading comprehension

Parameters:
  • session – tensorflow session
  • valid – data dictionary for validation set
  • vocab_tuple – a tuple containing vocab dictionaries in forward and reverse directions
  • num_examples – specify the number of samples to run for inference
  • dropout – Float value which is always 1.0 for inference
  • dynamic_question_mode – boolean indicating whether to accept questions from the user (used in the demo mode)
static obtain_indices(preds_start, preds_end)[source]

Function to get answer indices given the predictions

Parameters:
  • preds_start – predicted start indices
  • preds_end – predicted end indices
Returns:

final start and end indices for the answer

run_loop(session, train, mode='train', dropout=0.6)[source]

Function to run training/validation loop and display training loss, F1 & EM scores

Parameters:
  • session – tensorflow session
  • train – data dictionary for training/validation
  • dropout – float value
  • mode – ‘train’/’val’
unroll_with_attention(reverse=False)[source]

Function to run the match_lstm pass in both forward and reverse directions

Parameters:reverse (bool, optional) – whether to unroll in the reverse direction

nlp_architect.models.memn2n_dialogue module

class nlp_architect.models.memn2n_dialogue.MemN2N_Dialog(batch_size, vocab_size, sentence_size, memory_size, embedding_size, num_cands, max_cand_len, hops=3, max_grad_norm=40.0, nonlin=None, initializer=<tensorflow.python.ops.init_ops.RandomNormal object>, optimizer=<tensorflow.python.training.adam.AdamOptimizer object>, session=<tensorflow.python.client.session.Session object>, name='MemN2N_Dialog')[source]

Bases: object

End-To-End Memory Network.

batch_fit(stories, queries, answers, cands)[source]

Runs the training algorithm over the passed batch

Parameters:
  • stories – Tensor (None, memory_size, sentence_size)
  • queries – Tensor (None, sentence_size)
  • answers – Tensor (None, vocab_size)
Returns:

floating-point number, the loss computed for the batch

Return type:

loss

predict(stories, queries, cands)[source]

Predicts answers as one-hot encoding.

Parameters:
  • stories – Tensor (None, memory_size, sentence_size)
  • queries – Tensor (None, sentence_size)
Returns:

Tensor (None, vocab_size)

Return type:

answers
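
A toy usage sketch; the stories and queries shapes follow the descriptions above, while the candidates layout is an assumption:

```python
import numpy as np
from nlp_architect.models.memn2n_dialogue import MemN2N_Dialog

# toy dimensions for illustration only
model = MemN2N_Dialog(batch_size=2, vocab_size=50, sentence_size=6,
                      memory_size=10, embedding_size=20,
                      num_cands=5, max_cand_len=6, hops=3)
stories = np.zeros((2, 10, 6), dtype=np.int32)   # (batch, memory_size, sentence_size)
queries = np.zeros((2, 6), dtype=np.int32)       # (batch, sentence_size)
cands = np.zeros((2, 5, 6), dtype=np.int32)      # candidate layout is an assumption
answers = model.predict(stories, queries, cands)  # one-hot encoded answers
```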

nlp_architect.models.memn2n_dialogue.zero_nil_slot(t)[source]

Overwrites the nil_slot (first row) of the input Tensor with zeros.

The nil_slot is a dummy slot that should not be trained or influence the training algorithm.

nlp_architect.models.most_common_word_sense module

class nlp_architect.models.most_common_word_sense.MostCommonWordSense(epochs, batch_size, callback_args=None)[source]

Bases: object

build(input_dim)[source]
eval(valid_set)[source]
fit(train_set)[source]
get_outputs(valid_set)[source]
load(model_path)[source]
save(save_path)[source]

nlp_architect.models.ner_crf module

class nlp_architect.models.ner_crf.NERCRF(use_cudnn=False)[source]

Bases: object

Bi-LSTM NER model with CRF classification layer (tf.keras model)

Parameters:use_cudnn (bool, optional) – use cudnn LSTM cells
build(word_length, target_label_dims, word_vocab_size, char_vocab_size, word_embedding_dims=100, char_embedding_dims=16, tagger_lstm_dims=200, dropout=0.5)[source]

Build a NERCRF model

Parameters:
  • word_length (int) – max word length in characters
  • target_label_dims (int) – number of entity labels (for classification)
  • word_vocab_size (int) – word vocabulary size
  • char_vocab_size (int) – character vocabulary size
  • word_embedding_dims (int) – word embedding dimensions
  • char_embedding_dims (int) – character embedding dimensions
  • tagger_lstm_dims (int) – word tagger LSTM output dimensions
  • dropout (float) – dropout rate
fit(x, y, epochs=1, batch_size=1, callbacks=None, validation=None)[source]

Train a model given input samples and target labels.

Parameters:
  • x (numpy.ndarray or numpy.ndarray) – input samples
  • y (numpy.ndarray) – input sample labels
  • epochs (int, optional) – number of epochs to train
  • batch_size (int, optional) – batch size
  • callbacks (Callback, optional) – Keras compatible callbacks
  • validation (list of numpy.ndarray, optional) – optional validation data to be evaluated when training
load(path)[source]

Load model weights

Parameters:path (str) – path to load model from
load_embedding_weights(weights)[source]

Load word embedding weights into the model embedding layer

Parameters:weights (numpy.ndarray) – 2D matrix of word weights
predict(x, batch_size=1)[source]

Get the prediction of the model on given input

Parameters:
  • x (numpy.ndarray or numpy.ndarray) – input samples
  • batch_size (int, optional) – batch size
Returns:

predicted values by the model

Return type:

numpy.ndarray

save(path)[source]

Save model to path

Parameters:path (str) – path to save model weights
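
A build sketch with the documented dimensions; input encoding (word and character ids) and label preparation are dataset specific and only indicated in comments:

```python
from nlp_architect.models.ner_crf import NERCRF

ner = NERCRF(use_cudnn=False)
ner.build(word_length=12, target_label_dims=9, word_vocab_size=10000,
          char_vocab_size=60, word_embedding_dims=100,
          char_embedding_dims=16, tagger_lstm_dims=200, dropout=0.5)
# x_train / y_train are assumed to be pre-encoded arrays (not shown here):
# ner.fit(x_train, y_train, epochs=3, batch_size=32)
# y_pred = ner.predict(x_test, batch_size=32)
ner.save('ner_weights.h5')  # placeholder path
```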

nlp_architect.models.np2vec module

class nlp_architect.models.np2vec.NP2vec(corpus, corpus_format='txt', mark_char='_', word_embedding_type='word2vec', sg=0, size=100, window=10, alpha=0.025, min_alpha=0.0001, min_count=5, sample=1e-05, workers=20, hs=0, negative=25, cbow_mean=1, iterations=15, min_n=3, max_n=6, word_ngrams=1, prune_non_np=True)[source]

Bases: object

Initialize the np2vec model, train it, save it and load it.

is_marked(s)[source]

Check if a string is marked.

Parameters:s (str) – string to check
classmethod load(np2vec_model_file, binary=False, word_ngrams=0, word2vec_format=True)[source]

Load the np2vec model.

Parameters:
  • np2vec_model_file (str) – the file containing the np2vec model to load
  • binary (bool) – boolean indicating whether the np2vec model to load is in binary format
  • word_ngrams (int {1,0}) – If 1, np2vec model to load uses word vectors with subword (
  • information. (ngrams)) –
  • word2vec_format (bool) – boolean indicating whether the model to load has been stored in
  • word2vec format. (original) –
Returns:

np2vec model to load

save(np2vec_model_file='np2vec.model', binary=False, word2vec_format=True)[source]

Save the np2vec model.

Parameters:
  • np2vec_model_file (str) – the file containing the np2vec model to load
  • binary (bool) – boolean indicating whether the np2vec model to load is in binary format
  • word2vec_format (bool) – boolean indicating whether to save the model in original
  • format. (word2vec) –
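
A save/load round-trip sketch; ‘corpus.txt’ is a placeholder for a corpus in which noun phrases have already been marked with mark_char:

```python
from nlp_architect.models.np2vec import NP2vec

np2vec = NP2vec('corpus.txt', corpus_format='txt', mark_char='_',
                word_embedding_type='word2vec', size=100, window=10, min_count=5)
np2vec.save('np2vec.model', binary=False, word2vec_format=True)
reloaded = NP2vec.load('np2vec.model', binary=False, word_ngrams=0,
                       word2vec_format=True)
```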

nlp_architect.models.np_semantic_segmentation module

class nlp_architect.models.np_semantic_segmentation.NpSemanticSegClassifier(num_epochs, callback_args, loss='binary_crossentropy', optimizer='adam', batch_size=128)[source]

Bases: object

NP Semantic Segmentation classifier model (based on tf.Keras framework).

Parameters:
  • num_epochs (int) – number of epochs to train the model
  • callback_args (dict) – keyword arguments used to initialize a Callback for the model
  • loss – the model's cost function. Default is tf.keras.losses.binary_crossentropy
  • optimizer (tf.keras.optimizers) – the model’s optimizer. Default is ‘adam’
build(input_dim)[source]

Build the model's layers.

Parameters:input_dim (int) – the first layer's input_dim

eval(test_set)[source]

Evaluate the model on test_set, computing loss, accuracy, and precision/recall measures.

Parameters:test_set (numpy.ndarray) – The test set
Returns:loss, binary_accuracy, precision, recall and f1 measures
Return type:tuple(float)
fit(train_set)[source]

Train and fit the model on the training dataset.

Parameters:
  • train_set (numpy.ndarray) – The train set
  • args – callback_args and epochs from ArgParser input
get_outputs(test_set)[source]

Run the model's classification on the given dataset.

Parameters:test_set (numpy.ndarray) – The test set
Returns:model’s predictions
Return type:list(numpy.ndarray)
load(model_path)[source]

Load a pre-trained model's .h5 file into the NpSemanticSegClassifier object.

Parameters:model_path (str) – local path for loading the model
save(model_path)[source]

Save the model file to the model_path location.

Parameters:model_path (str) – local path for saving the model
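
A minimal sketch using the documented constructor and build arguments; input_dim and the dataset arrays are placeholders:

```python
from nlp_architect.models.np_semantic_segmentation import NpSemanticSegClassifier

clf = NpSemanticSegClassifier(num_epochs=10, callback_args=None,
                              loss='binary_crossentropy', optimizer='adam',
                              batch_size=128)
clf.build(input_dim=603)  # placeholder feature-vector size
# train_set / test_set are numpy arrays prepared by the dataset pipeline:
# clf.fit(train_set)
# loss, accuracy, precision, recall, f1 = clf.eval(test_set)
# predictions = clf.get_outputs(test_set)
```
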
nlp_architect.models.np_semantic_segmentation.f1(y_true, y_pred)[source]
Parameters:
  • y_true
  • y_pred

Returns:

nlp_architect.models.np_semantic_segmentation.precision_score(y_true, y_pred)[source]

Precision metric.

Only computes a batch-wise average of precision.

Computes the precision, a metric for multi-label classification of how many selected items are relevant.

nlp_architect.models.np_semantic_segmentation.recall_score(y_true, y_pred)[source]

Recall metric.

Only computes a batch-wise average of recall.

Computes the recall, a metric for multi-label classification of how many relevant items are selected.
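
The descriptions above correspond to the common batch-wise Keras-backend formulation of these metrics; a sketch of that formulation (not necessarily the library's exact code) is:

```python
import tensorflow.keras.backend as K

def f1_metric_sketch(y_true, y_pred):
    # batch-wise true/predicted/possible positives, rounded and clipped to {0, 1}
    tp = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    precision = tp / (predicted_positives + K.epsilon())
    recall = tp / (possible_positives + K.epsilon())
    return 2 * (precision * recall) / (precision + recall + K.epsilon())
```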

nlp_architect.models.pretrained_models module

class nlp_architect.models.pretrained_models.AbsaModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained ABSA model

files = ['rerank_model.h5']
sub_path = 'models/absa/'
class nlp_architect.models.pretrained_models.BistModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained BIST model

files = ['bist-pretrained.zip']
sub_path = 'models/dep_parse/'
class nlp_architect.models.pretrained_models.ChunkerModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained Chunker model

files = ['model.h5', 'model_info.dat.params']
sub_path = 'models/chunker/'
class nlp_architect.models.pretrained_models.IntentModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained Intent model

files = ['model_info.dat', 'model.h5']
sub_path = 'models/intent/'
class nlp_architect.models.pretrained_models.MrcModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained MRC model

files = ['mrc_data.zip', 'mrc_model.zip']
sub_path = 'models/mrc/'
class nlp_architect.models.pretrained_models.NerModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained NER model

files = ['model_v4.h5', 'model_info_v4.dat']
sub_path = 'models/ner/'
class nlp_architect.models.pretrained_models.PretrainedModel(model_name, sub_path, files)[source]

Bases: object

Generic class to download the pre-trained models

Usage Example:

    chunker = ChunkerModel.get_instance()
    chunker2 = ChunkerModel.get_instance()
    print(chunker, chunker2)
    print("Local File path = ", chunker.get_file_path())
    files_models = chunker2.get_model_files()
    for idx, file_name in enumerate(files_models):
        print(str(idx) + ": " + file_name)
get_file_path()[source]

Return local file path of downloaded model files

classmethod get_instance()[source]

Static instance access method.

Parameters:cls (class) – the calling class

get_model_files()[source]

Return individual file names of downloaded models

nlp_architect.models.supervised_sentiment module

nlp_architect.models.supervised_sentiment.one_hot_cnn(dense_out, max_len=300, frame='small')[source]

Temporal CNN Model

As defined in “Text Understanding from Scratch” by Zhang and LeCun, 2015 (https://arxiv.org/pdf/1502.01710v4.pdf). This model is a series of 1D CNNs with max-pooling and fully connected layers. The frame size may be either large or small.

Parameters:
  • dense_out (int) – size of the output dense layer; this is the number of classes
  • max_len (int) – length of the input text
  • frame (str) – frame size, either large or small
Returns:

temporal CNN model

Return type:

model (model)

nlp_architect.models.supervised_sentiment.simple_lstm(max_features, dense_out, input_length, embed_dim=256, lstm_out=140, dropout=0.5)[source]

Simple bi-directional LSTM model in Keras

Single-layer bi-directional LSTM with recurrent dropout and a fully connected layer

Parameters:
  • max_features (int) – vocabulary size
  • dense_out (int) – size of the output dense layer; this is the number of classes
  • input_length (int) – length of the input text
  • embed_dim (int) – internal embedding size used in the LSTM
  • lstm_out (int) – size of the bi-directional output layer
  • dropout (float) – value for recurrent dropout, between 0 and 1
Returns:

LSTM model

Return type:

model (model)
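
Both model builders can be called directly with the documented arguments; the sizes below are placeholders:

```python
from nlp_architect.models.supervised_sentiment import one_hot_cnn, simple_lstm

# 20k-word vocabulary, 2 sentiment classes, inputs padded to 300 tokens
lstm_model = simple_lstm(max_features=20000, dense_out=2, input_length=300,
                         embed_dim=256, lstm_out=140, dropout=0.5)
cnn_model = one_hot_cnn(dense_out=2, max_len=300, frame='small')
lstm_model.summary()
```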

nlp_architect.models.tagging module

class nlp_architect.models.tagging.InputFeatures(input_ids, char_ids, mask=None, label_id=None)[source]

Bases: object

A single set of features of data.

class nlp_architect.models.tagging.NeuralTagger(embedder_model, word_vocab: nlp_architect.utils.text.Vocabulary, labels: List[str] = None, use_crf: bool = False, device: str = 'cpu', n_gpus=0)[source]

Bases: nlp_architect.models.TrainableModel

Simple neural tagging model. Supports PyTorch embedder models, multi-GPU training, and knowledge distillation (KD) from teacher models.

Parameters:
  • embedder_model – pytorch embedder model (valid nn.Module model)
  • word_vocab (Vocabulary) – word vocabulary
  • labels (List, optional) – list of labels. Defaults to None
  • use_crf (bool, optional) – use CRF as the classifier (instead of Softmax). Defaults to False.
  • device (str, optional) – device backend. Defaults to ‘cpu’.
  • n_gpus (int, optional) – number of gpus. Defaults to 0.
static batch_mapper(batch)[source]

Map batch to correct input names

convert_to_tensors(examples: List[nlp_architect.data.sequential_tagging.TokenClsInputExample], max_seq_length: int = 128, max_word_length: int = 12, pad_id: int = 0, labels_pad_id: int = 0, include_labels: bool = True) → torch.utils.data.dataset.TensorDataset[source]

Convert examples to valid tagger dataset

Parameters:
  • examples (List[TokenClsInputExample]) – List of examples
  • max_seq_length (int, optional) – max words per sentence. Defaults to 128.
  • max_word_length (int, optional) – max characters in a word. Defaults to 12.
  • pad_id (int, optional) – padding int id. Defaults to 0.
  • labels_pad_id (int, optional) – labels padding id. Defaults to 0.
  • include_labels (bool, optional) – include labels in dataset. Defaults to True.
Returns:

TensorDataset for given examples

Return type:

TensorDataset

evaluate(data_set: torch.utils.data.dataloader.DataLoader)[source]

Run evaluation on given dataloader

Parameters:data_set (DataLoader) – a data loader to run evaluation on
Returns:logits, labels (if labels are given)
evaluate_predictions(logits, label_ids)[source]

Evaluate given logits on truth labels

Parameters:
  • logits – logits of model
  • label_ids – truth label ids
Returns:

dictionary containing P/R/F1 metrics

Return type:

dict

extract_labels(label_ids, logits)[source]
get_logits(batch)[source]

get model logits from given input

get_optimizer(opt_fn=None, lr: int = 0.001)[source]

Get default optimizer

Parameters:lr (int, optional) – learning rate. Defaults to 0.001.
Returns:optimizer
Return type:torch.optim.Optimizer
inference(examples: List[nlp_architect.data.sequential_tagging.TokenClsInputExample], batch_size: int = 64)[source]

Do inference on given examples

Parameters:
  • examples (List[TokenClsInputExample]) – examples
  • batch_size (int, optional) – batch size. Defaults to 64.
Returns:

a list of tuples of tokens, tags predicted by model

Return type:

List(tuple)

classmethod load_model(model_path: str)[source]

Load a tagger model from given path

Parameters:model_path (str) – model path
Returns:tagger model loaded from path
Return type:NeuralTagger
save_model(output_dir: str)[source]

Save model to path

Parameters:output_dir (str) – output directory
to(device='cpu', n_gpus=0)[source]

Put model on given device

Parameters:
  • device (str, optional) – device backend. Defaults to ‘cpu’.
  • n_gpus (int, optional) – number of gpus. Defaults to 0.
train(train_data_set: torch.utils.data.dataloader.DataLoader, dev_data_set: torch.utils.data.dataloader.DataLoader = None, test_data_set: torch.utils.data.dataloader.DataLoader = None, epochs: int = 3, batch_size: int = 8, optimizer=None, max_grad_norm: float = 5.0, logging_steps: int = 50, save_steps: int = 100, save_path: str = None, distiller: nlp_architect.nn.torch.distillation.TeacherStudentDistill = None)[source]

Train a tagging model

Parameters:
  • train_data_set (DataLoader) – train examples dataloader. If a distiller object is provided, train examples should contain a tuple of student/teacher data examples.
  • dev_data_set (DataLoader, optional) – dev examples dataloader. Defaults to None.
  • test_data_set (DataLoader, optional) – test examples dataloader. Defaults to None.
  • epochs (int, optional) – num of epochs to train. Defaults to 3.
  • batch_size (int, optional) – batch size. Defaults to 8.
  • optimizer (fn, optional) – optimizer function. Defaults to default model optimizer.
  • max_grad_norm (float, optional) – max gradient norm. Defaults to 5.0.
  • logging_steps (int, optional) – number of steps between logging. Defaults to 50.
  • save_steps (int, optional) – number of steps between model saves. Defaults to 100.
  • save_path (str, optional) – model output path. Defaults to None.
  • distiller (TeacherStudentDistill, optional) – KD model for training the model using a teacher model. Defaults to None.
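
A minimal inference sketch; the model directory is a placeholder, and the construction of TokenClsInputExample objects is not covered on this page:

```python
from nlp_architect.models.tagging import NeuralTagger

# 'tagger_model_dir' is a placeholder for a directory written by save_model()
tagger = NeuralTagger.load_model(model_path='tagger_model_dir')
tagger.to(device='cpu', n_gpus=0)

# examples: List[TokenClsInputExample] built from your data (construction omitted)
# predictions = tagger.inference(examples, batch_size=64)  # list of (tokens, tags)
```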

nlp_architect.models.temporal_convolutional_network module

class nlp_architect.models.temporal_convolutional_network.CommonLayers[source]

Bases: object

Class that contains the common layers for language modeling: word embeddings and the projection layer.

define_input_layer(input_placeholder_tokens, word_embeddings, embeddings_trainable=True)[source]

Define the input word embedding layer.

Parameters:
  • input_placeholder_tokens (tf.placeholder) – input to the model
  • word_embeddings (numpy array, optional) – array to initialize the embeddings with
  • embeddings_trainable (bool) – whether or not to train the embedding table

Returns:Embeddings corresponding to the data in input placeholder
define_projection_layer(prediction, tied_weights=True)[source]

Define the output word embedding layer.

Parameters:
  • prediction (tf.tensor) – the prediction from the model
  • tied_weights (bool) – whether or not to tie weights from the input embedding layer

Returns:Probability distribution over vocabulary
class nlp_architect.models.temporal_convolutional_network.TCN(max_len, n_features_in, hidden_sizes, kernel_size=7, dropout=0.2)[source]

Bases: object

This class defines the core TCN architecture. This is only the base class; the training strategy is not implemented.

build_network_graph(x, last_timepoint=False)[source]

Given the input placeholder x, build the entire TCN graph.

Parameters:
  • x – input placeholder
  • last_timepoint (bool) – whether or not to select only the last timepoint to output

Returns:output of the TCN
build_train_graph(*args, **kwargs)[source]

Placeholder for defining training losses and metrics

calculate_receptive_field()[source]

Returns:

run(*args, **kwargs)[source]

Placeholder for defining training strategy
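
A graph-building sketch; treating hidden_sizes as a list of per-level channel counts is an assumption, and training losses/strategy must be added by subclassing (see build_train_graph and run above):

```python
import tensorflow as tf
from nlp_architect.models.temporal_convolutional_network import TCN

# toy sequence setup; hidden_sizes layout is an assumption
tcn = TCN(max_len=100, n_features_in=32, hidden_sizes=[64, 64, 64],
          kernel_size=7, dropout=0.2)
x = tf.placeholder(tf.float32, shape=[None, 100, 32])
output = tcn.build_network_graph(x, last_timepoint=True)
```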

class nlp_architect.models.temporal_convolutional_network.WeightNorm(layer, data_init=False, **kwargs)[source]

Bases: tensorflow.python.keras.layers.wrappers.Wrapper

This wrapper reparameterizes a layer by decoupling the weight’s magnitude and direction. This speeds up convergence by improving the conditioning of the optimization problem.

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, Tim Salimans and Diederik P. Kingma (2016): https://arxiv.org/abs/1602.07868

WeightNorm wrapper works for keras and tf layers.

```python
net = WeightNorm(tf.keras.layers.Conv2D(2, 2, activation='relu'),
                 input_shape=(32, 32, 3), data_init=True)(x)
net = WeightNorm(tf.keras.layers.Conv2D(16, 5, activation='relu'),
                 data_init=True)(net)
net = WeightNorm(tf.keras.layers.Dense(120, activation='relu'),
                 data_init=True)(net)
net = WeightNorm(tf.keras.layers.Dense(n_classes),
                 data_init=True)(net)
```

Parameters:
  • layer – a layer instance.
  • data_init – If True use data dependent variable initialization
Raises:
  • ValueError – If not initialized with a Layer instance.
  • ValueError – If Layer does not contain a kernel of weights
  • NotImplementedError – If data_init is True and running graph execution
build(input_shape)[source]

Build Layer

call(inputs)[source]

Call Layer

compute_output_shape(input_shape)[source]

Computes the output shape of the layer.

Assumes that the layer will be built to match that input shape provided.

Parameters:input_shape – Shape tuple (tuple of integers) or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
Returns:An output shape tuple.

Module contents

class nlp_architect.models.TrainableModel[source]

Bases: abc.ABC

Base class for a trainable model

convert_to_tensors(*args, **kwargs)[source]

convert any chosen input to valid model format of tensors

get_logits(*args, **kwargs)[source]

get model logits from given input

inference(*args, **kwargs)[source]

run inference

load_model(*args, **kwargs)[source]

load a model

save_model(*args, **kwargs)[source]

save the model

train(*args, **kwargs)[source]

train the model
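
Concrete models in this package implement this interface; a minimal subclass skeleton (method bodies omitted) looks like the following:

```python
from nlp_architect.models import TrainableModel

class MyModel(TrainableModel):
    """Skeleton only; real models implement each method."""

    def convert_to_tensors(self, *args, **kwargs):
        raise NotImplementedError

    def get_logits(self, *args, **kwargs):
        raise NotImplementedError

    def inference(self, *args, **kwargs):
        raise NotImplementedError

    def load_model(self, *args, **kwargs):
        raise NotImplementedError

    def save_model(self, *args, **kwargs):
        raise NotImplementedError

    def train(self, *args, **kwargs):
        raise NotImplementedError
```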